Comparing N-Node Set Importance Representative results with Node Importance Representative results for Categorical Clustering: An exploratory study
Authors
Abstract
As data volume grows along with the storage space it occupies, clustering a very large data set becomes difficult and time-consuming. Sampling is an important technique for scaling down the data set and improving the efficiency of clustering. After sampling, however, allocating the unlabeled objects to the proper clusters is difficult in the categorical domain. To address this problem, Chen employed a mechanism called MARDL, which allocates each unlabeled data point to the appropriate cluster based on the NIR and NNIR algorithms. This paper builds on Chen's investigation: it analyzes and compares the results of NIR and NNIR and concludes that the two contradict each other when measuring the resemblance between an unlabeled data point and a cluster. It then proposes a new approach that computes the resemblance between an unlabeled data point and every cluster and allocates the point to the cluster with maximal resemblance.
International Journal of Computer Engineering Science (IJCES), Volume 2, Issue 8 (August 2012), ISSN: 2250-3439, https://sites.google.com/site/ijcesjournal, http://www.ijces.com/
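The allocation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's or Chen's implementation: the weighting formula (within-cluster frequency scaled by one minus the normalized entropy of a value across clusters) is an assumption loosely modeled on the node-importance idea, and the function names `node_importance`, `resemblance`, and `allocate` are hypothetical. A "node" here is an (attribute index, value) pair.

```python
from collections import Counter
from math import log


def node_importance(clusters):
    """Return one dict per cluster mapping (attribute_index, value) -> weight.

    ASSUMPTION: weight = within-cluster frequency * (1 - normalized entropy of
    the node across clusters). This is an illustrative formula, not the source's.
    """
    k = len(clusters)
    # Count how often each node occurs in each labeled cluster.
    counts = [Counter((r, v) for row in c for r, v in enumerate(row))
              for c in clusters]
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)

    weights = []
    for cluster, cnt in zip(clusters, counts):
        w = {}
        for node, n in cnt.items():
            # Distribution of this node over the clusters that contain it.
            probs = [c[node] / totals[node] for c in counts if c[node] > 0]
            entropy = -sum(p * log(p) for p in probs) / log(k) if k > 1 else 0.0
            w[node] = (n / len(cluster)) * (1.0 - entropy)
        weights.append(w)
    return weights


def resemblance(point, cluster_weights):
    """Sum the weights of the point's nodes within one cluster."""
    return sum(cluster_weights.get((r, v), 0.0) for r, v in enumerate(point))


def allocate(point, weights):
    """Score the point against every cluster and pick the maximal resemblance."""
    scores = [resemblance(point, w) for w in weights]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Given two labeled clusters of sampled rows, `allocate(point, node_importance(clusters))` returns the index of the cluster with the highest score together with the scores for all clusters, so the resemblance to every cluster stays visible rather than only the winner.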
Similar resources
An approach to deal with time-evolving Categorical data based on NIR using Clustering
Data clustering is an important technique for exploratory data analysis and has been the focus of substantial research in several domains for decades. Sampling has been recognized as an important technique for improving the efficiency of clustering. However, with sampling applied, the points that are not sampled do not receive labels after the normal process. Although there is a ...
Our-NIR: Node Importance Representative for Clustering of Categorical Data
Evaluating node importance in clustering is an active research problem, and many methods have been developed. Most clustering algorithms deal with general similarity measures. However, in real situations the data usually changes over time. Clustering this type of data not only decreases the quality of clusters but also disregards the expectation of users,...
Region Directed Diffusion in Sensor Network Using Learning Automata: RDDLA
One of the main challenges in wireless sensor networks is energy consumption and the lifetime of the nodes. Several methods can be used to extend node lifetime. One of them is balancing the load across nodes while transmitting data from source to destination. Directed diffusion is one such method for wireless sensor networks and is a data-oriented algorithm. Direct...
Inappropriate cervical injection of radiotracer for sentinel node mapping in a uterine cervix cancer patient: importance of lymphoscintigraphy and blue dye injection
Herein, we report a case of sentinel lymph node mapping in a uterine cervix cancer patient referred to the nuclear medicine department of our institute. Lymphoscintigraphy images showed inappropriate intra-cervical injection of radiotracer. The blue dye technique was applied for sentinel lymph node mapping, using intra-cervical injection of methylene blue. Two blue/cold sentinel lymph nodes, with...
Journal:
- CoRR
Volume: abs/1208.4809
Issue:
Pages: -
Publication date: 2012